Smart Paper Notes

MDPose: Human Skeletal Motion Reconstruction Using WiFi Micro-Doppler Signatures

Chong Tang, Wenda Li, Shelly Vishwakarma, Fangzhan Shi, Simon Julier, Kevin Chetty

Category: Computer Vision

2022-01-11

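Motion tracking systems based on optical sensors often suffer from poor lighting conditions, occlusion, and limited coverage, and they can raise privacy concerns. Recently, radio frequency (RF) approaches using commercial WiFi devices have emerged, offering low-cost, ubiquitous sensing while preserving privacy. However, the output of an RF sensing system, such as a range-Doppler spectrogram, does not represent human motion intuitively and usually requires further processing. This study proposes MDPose, a novel framework for human skeletal motion reconstruction based on WiFi micro-Doppler signatures. It offers an effective way to track human activity by reconstructing a skeleton model with 17 keypoints, which helps interpret conventional RF sensing outputs in a more understandable form. Specifically, MDPose works in incremental stages that address a series of challenges in turn. First, a denoising algorithm removes unwanted noise that could affect feature extraction and enhances weak Doppler signatures. Second, a convolutional neural network (CNN)-recurrent neural network (RNN) architecture learns temporal-spatial dependencies from the clean micro-Doppler signatures and recovers the velocity information of the keypoints. Finally, a pose optimization mechanism estimates the initial state of the skeleton and limits the growth of error. We ran comprehensive tests with numerous subjects in a variety of environments using a single-receiver radar system to demonstrate the performance of MDPose, and we report an absolute error of 29.4 mm across all keypoint positions, which outperforms state-of-the-art RF-based pose estimation systems.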
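
The last two stages described above (per-keypoint velocities plus an estimated initial skeleton state) suggest that poses are obtained by accumulating velocities over time. The following is a minimal sketch of that idea only, not the authors' implementation: the function name, the 3D tuple representation, and the fixed time step `dt` are all assumptions made for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]
Pose = List[Point]  # one (x, y, z) position per keypoint (hypothetical layout)


def integrate_velocities(initial_pose: Pose, velocities: List[Pose], dt: float) -> List[Pose]:
    """Accumulate per-keypoint velocities into a sequence of poses.

    initial_pose: estimated starting position of every keypoint.
    velocities:   one entry per frame, holding a (vx, vy, vz) per keypoint.
    dt:           time between frames (assumed constant here).
    """
    poses = [list(initial_pose)]
    for frame_velocity in velocities:
        previous = poses[-1]
        # Simple forward-Euler step: new position = old position + velocity * dt.
        poses.append([
            (px + vx * dt, py + vy * dt, pz + vz * dt)
            for (px, py, pz), (vx, vy, vz) in zip(previous, frame_velocity)
        ])
    return poses


# Two keypoints, two frames of constant velocity, half a time unit per frame.
start = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
motion = [[(2.0, 0.0, 0.0), (0.0, 2.0, 0.0)]] * 2
trajectory = integrate_velocities(start, motion, dt=0.5)
```

Because each step adds to the previous pose, any velocity error accumulates from frame to frame, which is one plausible reason the abstract mentions a mechanism to limit the growth of error.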

Translated by Google Translate

An Investigation of Indian Native Language Phonemic Influences on L2 English Pronunciations

Shelly Jain, Priyanshi Pal, Anil Vuppala, Prasanta Ghosh, Chiranjeevi Yarra

Category: Natural Language Processing

2022-12-19

Speech systems are sensitive to accent variations. This is especially challenging in the Indian context, with an abundance of languages but a dearth of linguistic studies characterising pronunciation variations. The growing number of L2 English speakers in India reinforces the need to study accents and L1-L2 interactions. We investigate the accents of Indian English (IE) speakers and report in detail our observations, both specific and common to all regions. In particular, we observe the phonemic variations and phonotactics occurring in the speakers' native languages and apply this to their English pronunciations. We demonstrate the influence of 18 Indian languages on IE by comparing the native language pronunciations with IE pronunciations obtained jointly from existing literature studies and phonetically annotated speech of 80 speakers. Consequently, we are able to validate the intuitions of Indian language influences on IE pronunciations by justifying pronunciation rules from the perspective of Indian language phonology. We obtain a comprehensive description in terms of universal and region-specific characteristics of IE, which facilitates accent conversion and adaptation of existing ASR and TTS systems to different Indian accents.


Transformer-Based Named Entity Recognition for French Using Adversarial Adaptation to Similar Domain Corpora

Arjun Choudhry, Pankaj Gupta, Inder Khatri, Aaryan Gupta, Maxime Nicol, Marie-Jean Meurs, Dinesh Kumar Vishwakarma

Category: Natural Language Processing

2022-12-05

Named Entity Recognition (NER) involves the identification and classification of named entities in unstructured text into predefined classes. NER in languages with limited resources, like French, is still an open problem due to the lack of large, robust, labelled datasets. In this paper, we propose a transformer-based NER approach for French using adversarial adaptation to similar domain or general corpora for improved feature extraction and better generalization. We evaluate our approach on three labelled datasets and show that our adaptation framework outperforms the corresponding non-adaptive models for various combinations of transformer models, source datasets and target corpora.


An Emotion-Aware Multi-Task Approach to Fake News and Rumour Detection using Transfer Learning

Arjun Choudhry, Inder Khatri, Minni Jain, Dinesh Kumar Vishwakarma

Category: Natural Language Processing | Machine Learning

2022-11-22

Social networking sites, blogs, and online articles are instant sources of news for internet users globally. However, in the absence of strict regulations mandating the genuineness of every text on social media, it is probable that some of these texts are fake news or rumours. Their deceptive nature and ability to propagate instantly can have an adverse effect on society. This calls for more effective detection of fake news and rumours on the web. In this work, we annotate four fake news detection and rumour detection datasets with their emotion class labels using transfer learning. We show the correlation between the legitimacy of a text and its intrinsic emotion for fake news and rumour detection, and show that even within the same emotion class, fake and real news are often represented differently, which can be used for improved feature extraction. Based on this, we propose a multi-task framework for fake news and rumour detection that predicts both the emotion and the legitimacy of the text. We train a variety of deep learning models in single-task and multi-task settings for a more comprehensive comparison. We further analyze the performance of our multi-task approach for fake news detection in cross-domain settings to verify that it generalizes better across datasets and that emotions act as a domain-independent feature. Experimental results verify that our multi-task models consistently outperform their single-task counterparts in terms of accuracy, precision, recall, and F1 score, in both in-domain and cross-domain settings. We also qualitatively analyze the difference in performance between single-task and multi-task learning models.


Modeling User Behavior With Interaction Networks for Spam Detection

Prabhat Agarwal, Manisha Srivastava, Vishwakarma Singh, Charles Rosenberg

Category: Machine Learning

2022-07-21

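Spam is a serious problem plaguing web-scale digital platforms that facilitate user content creation and distribution. It compromises a platform's integrity, the performance of services such as recommendation and search, and the overall business. Spammers engage in a variety of abusive and evasive behaviors that are distinct from those of non-spammers. Users' complex behavior can be well represented by a heterogeneous graph rich in node and edge attributes. Learning to identify spammers in such a graph for a web-scale platform is challenging because of its structural complexity and size. In this paper, we propose SEINE (Spam DEtection using Interaction NEtworks), a spam detection model over a novel graph framework. Our graph simultaneously captures rich user details and behavior, and enables learning on a billion-scale graph. Our model considers the neighborhood along with edge types and attributes, allowing it to capture a wide range of spammers. SEINE, trained on a real dataset of tens of millions of nodes and billions of edges, achieves 80% recall at a 1% false positive rate. SEINE achieves performance comparable to state-of-the-art techniques on a public dataset while being practical to use in a large-scale production system.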
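
The headline figure above, recall at a fixed false positive rate, can be made concrete with a small calculation. The sketch below is a generic illustration, not code from the paper: the function name, the convention that an account is flagged when its score is at or above the threshold, and the toy scores are all assumptions.

```python
from typing import List, Tuple


def recall_at_fpr(scores: List[float], labels: List[int], max_fpr: float) -> Tuple[float, float]:
    """Return (recall, threshold) for the highest recall whose FPR stays <= max_fpr.

    labels use 1 for spammer and 0 for benign; an account is flagged
    when its score is greater than or equal to the threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative example")

    best_recall, best_threshold = 0.0, float("inf")
    # Lowering the threshold can only add flagged accounts, so both the
    # true-positive and false-positive counts are non-decreasing.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        if fp / negatives > max_fpr:
            break  # every lower threshold would also exceed the FPR budget
        best_recall, best_threshold = tp / positives, threshold
    return best_recall, best_threshold


# Toy data: three spammers (label 1) and three benign accounts (label 0).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
toy_labels = [1, 1, 0, 1, 0, 0]
strict = recall_at_fpr(toy_scores, toy_labels, max_fpr=0.0)
relaxed = recall_at_fpr(toy_scores, toy_labels, max_fpr=0.34)
```

Fixing the false positive rate first and then reading off recall reflects the operating constraint of a production system, where wrongly flagging legitimate users is usually the cost that must be bounded.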


Hunting Group Clues with Transformers for Social Group Activity Recognition

Masato Tamura, Rahul Vishwakarma, Ravigopal Vennelakanti

Category: Computer Vision

2022-07-12

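This paper presents a novel framework for social group activity recognition. As an extension of group activity recognition, social group activity recognition requires recognizing the activities of multiple sub-groups and identifying the members of each group. Most existing methods tackle both tasks by refining region features and then summarizing them into activity features. Such heuristic feature design makes the effectiveness of the features vulnerable to incomplete person localization and disregards the importance of scene context. Furthermore, region features are sub-optimal for identifying group members, because they may be dominated by the people in the region and carry different semantics. To overcome these drawbacks, we propose to leverage the attention modules in transformers to generate effective social group features. Our method is designed so that the attention modules identify and then aggregate the features relevant to social group activities, producing one effective feature for each social group. Group member information is embedded in the features and is thus accessed by feed-forward networks. The outputs of the feed-forward networks represent groups, so group members can be identified by simple Hungarian matching between groups and individuals. Experimental results show that our method outperforms state-of-the-art methods on the Volleyball and Collective Activity datasets.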
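
The member identification step above is a one-to-one assignment between group outputs and detected individuals. The sketch below illustrates that assignment under loud assumptions: it treats both sides as 2D points, uses Euclidean distance as the cost, and replaces the Hungarian algorithm with a brute-force search that returns the same minimum-cost matching but is only practical for a handful of people. The function name and data are hypothetical.

```python
from itertools import permutations
from math import dist
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def match_members(predicted: Sequence[Point], detected: Sequence[Point]) -> List[Tuple[int, int]]:
    """Return (predicted_index, detected_index) pairs that minimise total distance.

    Brute force over all one-to-one assignments; the Hungarian algorithm
    finds the same optimum in polynomial time.
    """
    if len(predicted) > len(detected):
        raise ValueError("need at least as many detections as predicted members")

    best_cost, best_assignment = float("inf"), ()
    for candidate in permutations(range(len(detected)), len(predicted)):
        # candidate[i] is the detection assigned to predicted member i.
        cost = sum(dist(predicted[i], detected[j]) for i, j in enumerate(candidate))
        if cost < best_cost:
            best_cost, best_assignment = cost, candidate
    return list(enumerate(best_assignment))


# Two predicted member locations for one group, three detected people.
predicted_members = [(0.0, 0.0), (10.0, 10.0)]
detected_people = [(9.0, 9.0), (1.0, 0.0), (5.0, 5.0)]
pairs = match_members(predicted_members, detected_people)
```

Here the person at (5, 5) is left unmatched, which corresponds to an individual who does not belong to this group.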


PSP-HDRI$+$: A Synthetic Dataset Generator for Pre-Training of Human-Centric Computer Vision Models

Salehe Erfanian Ebadi, Saurav Dhakad, Sanjay Vishwakarma, Chunpu Wang, You-Cyuan Jhang, Maciek Chociej, Adam Crespi, Alex Thaman, Sujoy Ganguly

Category: Computer Vision | Artificial Intelligence | Machine Learning

2022-07-11

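We introduce a new synthetic data generator, PSP-HDRI$+$, which proves to be a superior pre-training alternative to ImageNet and to other large-scale synthetic data counterparts. We demonstrate that pre-training with our synthetic data yields a more general model that performs better than the alternatives even when tested on out-of-distribution (OOD) sets. Furthermore, using ablation studies guided by person keypoint estimation metrics with an off-the-shelf model architecture, we show how to manipulate our synthetic data generator to further improve model performance.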


Study of Indian English Pronunciation Variabilities relative to Received Pronunciation

Priyanshi Pal, Shelly Jain, Anil Vuppala, Chiranjeevi Yarra, Prasanta Ghosh

Category: Natural Language Processing

2022-04-13

Analysis of Indian English (IE) pronunciation variabilities is useful in building systems for Automatic Speech Recognition (ASR) and Text-to-Speech (TTS) synthesis in the Indian context. Typically, these pronunciation variabilities have been explored by comparing IE pronunciation with Received Pronunciation (RP). However, exploring these variabilities requires pronunciation data labelled at the phonetic level, which is scarce for IE. Moreover, the versatility of IE stems from the influence of the large diversity of the speakers' mother tongues and from demographic region differences. Prior linguistic works have characterised features of IE variabilities qualitatively by reporting phonetic rules that represent such variations relative to RP. These qualitative descriptions often lack quantitative descriptors and data-driven analysis of diverse IE pronunciation data to characterise IE at the phonetic level. To address these issues, in this work, we consider a corpus, Indic TIMIT, containing a large set of IE varieties from 80 speakers from various regions of India. We present an analysis to obtain a new set of phonetic rules representing IE pronunciation variabilities relative to RP in a data-driven manner. We do this using 15,974 phonetic transcriptions, of which 13,632 were obtained manually in addition to those that are part of the corpus. Furthermore, we validate the rules obtained from the analysis against the existing phonetic rules to identify the relevance of the obtained phonetic rules, and we test the efficacy of Grapheme-to-Phoneme (G2P) conversion developed from the obtained rules, using Phoneme Error Rate (PER) as the performance metric.
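
Phoneme Error Rate, the metric named above, is commonly computed as the edit distance between the predicted and reference phoneme sequences divided by the reference length. The sketch below follows that common definition; the abstract does not spell out the exact formula used, and the function names and the example transcriptions are made up for illustration.

```python
from typing import Sequence


def edit_distance(reference: Sequence[str], hypothesis: Sequence[str]) -> int:
    """Levenshtein distance counting substitutions, insertions, and deletions."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_symbol in enumerate(reference, start=1):
        current = [i]
        for j, hyp_symbol in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_symbol != hyp_symbol)
            deletion = previous[j] + 1       # reference symbol missing from hypothesis
            insertion = current[j - 1] + 1   # extra symbol in hypothesis
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


def phoneme_error_rate(reference: Sequence[str], hypothesis: Sequence[str]) -> float:
    """Edit distance normalised by the number of reference phonemes."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Made-up transcriptions: one substituted phoneme out of four.
reference_phones = ["W", "AO", "T", "ER"]
g2p_output = ["V", "AO", "T", "ER"]
per = phoneme_error_rate(reference_phones, g2p_output)
```

Because insertions count as errors, PER can exceed 1.0 when the hypothesis is much longer than the reference.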


Locally Shifted Attention With Early Global Integration

Shelly Sheynin, Sagie Benaim, Adam Polyak, Lior Wolf

Category: Computer Vision | Artificial Intelligence

2021-12-09

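Recent work has shown the potential of transformers for computer vision applications. An image is first partitioned into patches, which are then used as input tokens for the attention mechanism. Because of the expensive quadratic cost of attention, either a large patch size is used, which results in coarse global interactions, or attention is applied only to local regions of the image, at the expense of long-range interactions. In this work, we propose an approach that allows both coarse global interactions and fine-grained local interactions already at the early layers of a vision transformer. At the core of our method is the application of local and global attention layers. In the local attention layer, we apply attention to each patch and its local shifts, resulting in virtually located local patches that are not bound to a single, specific location. These virtually located patches are then used in a global attention layer. Separating the attention layer into local and global counterparts keeps the computational cost low in the number of patches, while still supporting data-dependent localization already at the first layer, as opposed to the static positioning in other vision transformers. Our method is shown to be superior to both convolutional and transformer-based methods for image classification on CIFAR10, CIFAR100, and ImageNet. Code is available at: https://github.com/shellysheynin/locally-sag-transformer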


Universalizing Weak Supervision

Changho Shin, Winfred Li, Harit Vishwakarma, Nicholas Roberts, Frederic Sala

Category: Machine Learning | Artificial Intelligence

2021-12-07

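Weak supervision (WS) frameworks are a popular way to bypass hand-labeling large datasets for training data-hungry models. These approaches synthesize multiple noisy but cheaply acquired estimates of labels into a set of high-quality pseudo-labels for downstream training. However, the synthesis technique is specific to a particular kind of label, such as binary labels or sequences, and each new label type requires manually designing a new synthesis algorithm. Instead, we propose a universal technique that enables weak supervision over any label type while still offering desirable properties, including practical flexibility, computational efficiency, and theoretical guarantees. We apply this technique to important problems not previously tackled by WS frameworks, including learning to rank, regression, and learning in hyperbolic manifolds. Theoretically, our synthesis approach produces a consistent estimator for learning a challenging but important generalization of the exponential family model. Experimentally, we validate our framework and show improvement over baselines in diverse settings, including real-world learning-to-rank and regression problems as well as learning on hyperbolic manifolds.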
